MongoDB to SQL Migration Plan

Moving our existing customer databases to a SQL-based DB is already a settled decision, but which SQL database should we actually pick? With PostgreSQL, MySQL, MariaDB, Oracle, Microsoft SQL Server, and SQLite all on the table, we need to establish criteria for the decision.

Decision Criteria

1. Price

Enterprise DBs are obscenely expensive. The target must be cheaper than MongoDB and still reasonably priced once AWS costs are factored in.

https://aws.amazon.com/rds/pricing/

| DB Type | Deployment | Instance | Price Per Hour |
| --- | --- | --- | --- |
| RDS for Oracle License Included (LI) | Single-AZ | db.t3.small | $0.088 |
| RDS for Oracle License Included (LI) | Multi-AZ | db.t3.small | $0.176 |
| RDS for Oracle BYOL [1] | Single-AZ | db.t3.small | $0.052 |
| RDS for Oracle BYOL [1:1] | Multi-AZ | db.t3.small | $0.104 |
| RDS for MySQL | Single-AZ | db.t4g.micro | $0.025 |
| RDS for MySQL | Single-AZ | db.t4g.small | $0.051 |
| RDS for MySQL | Multi-AZ | db.t4g.micro | $0.051 |
| RDS for MySQL | Multi-AZ | db.t4g.small | $0.102 |
| Amazon RDS for PostgreSQL | Single-AZ | db.t4g.micro | $0.025 |
| Amazon RDS for PostgreSQL | Single-AZ | db.t4g.small | $0.051 |
| Amazon RDS for PostgreSQL | Multi-AZ | db.t4g.micro | $0.051 |
| Amazon RDS for PostgreSQL | Multi-AZ | db.t4g.small | $0.102 |
| RDS for MariaDB | Single-AZ | db.t4g.micro | $0.025 |
| RDS for MariaDB | Single-AZ | db.t4g.small | $0.051 |
| RDS for MariaDB | Multi-AZ | db.t4g.micro | $0.051 |
| RDS for MariaDB | Multi-AZ | db.t4g.small | $0.102 |
| RDS for SQL Server | On-Demand Instances | db.t3.micro | $0.031 |
| RDS for SQL Server | On-Demand Instances | db.t3.small | $0.062 |

AWS Aurora is a high-performance database service compatible with MySQL and PostgreSQL. It comes in two billing flavors: Aurora Standard, which charges for I/O, and Aurora I/O-Optimized, which charges by storage capacity instead.

Aurora also offers several capacity options: Serverless, Provisioned On-Demand Instances, Provisioned Reserved Instances, and Limitless.

The Aurora Serverless option scales automatically, as serverless implies: it shuts itself down when there are no requests and allocates additional capacity when a burst of transactions arrives. Unlike the Provisioned options it is not always on, so usage is billed in a dedicated unit called the ACU (Aurora Capacity Unit), metered with per-second precision.

| Measure | Aurora Standard (per ACU hour) | Aurora I/O-Optimized (per ACU hour) |
| --- | --- | --- |
| Aurora Capacity Unit | $0.20 | $0.26 |
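
To get a feel for the numbers, a rough back-of-the-envelope sketch (the 730-hour month and the steady average load are simplifying assumptions):

// Rough Aurora Serverless cost sketch. Assumptions: steady average load,
// 730-hour month, Aurora Standard ACU pricing from the table above.
const ACU_HOUR_STANDARD = 0.2; // $ per ACU-hour
const HOURS_PER_MONTH = 730;

function monthlyServerlessCost(avgAcu: number): number {
  return avgAcu * HOURS_PER_MONTH * ACU_HOUR_STANDARD;
}

console.log(monthlyServerlessCost(1));   // 146 -> ~$146/month at a steady 1 ACU
console.log(monthlyServerlessCost(0.5)); // 73  -> ~$73/month at a steady 0.5 ACU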

If serverless doesn't appeal, there are Provisioned On-Demand Instances and Provisioned Reserved Instances. The former is an ordinary RDS instance billed as you go and invoiced monthly, while a Reserved Instance is paid for upfront in multi-month or multi-year terms.

| Standard Instances - Current Generation | Aurora Standard (Price Per Hour) | Aurora I/O-Optimized (Price Per Hour) |
| --- | --- | --- |
| db.t4g.medium | $0.113 | $0.147 |
| db.t4g.large | $0.227 | $0.295 |
| db.t3.medium | $0.125 | $0.163 |
| db.t3.large | $0.25 | $0.325 |

2. Support

Pay enough money (i.e., buy an enterprise plan) and you get all kinds of technical and after-sales support. PostgreSQL reportedly also has commercial support available for enterprise customers. That kind of backing is certainly sturdier than raw open source, so it is worth considering in a business where money changes hands.

3. Performance

We're coming from NoSQL anyway... for our relational access patterns, practically any SQL DB should deliver a noticeable performance improvement.

4. ORDB

PostgreSQL gives me pause here. It is not a plain RDBMS but an ORDBMS, so the notion of an "object" comes into play: functions can be attached to a schema, making it an object through and through. I am not yet sure what differentiation that buys us, and we still need to figure out whether our domain needs an ORDB at all.

5. Types

Do we use types like Enum, Uuid, Decimal, Json, and Array in our service? If so, a database that supports as many of these types natively as possible will have the edge; otherwise we will have to add conversion logic during the migration.
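
For a feel of what that conversion logic might look like, a minimal sketch (the Decimal128/ObjectId handling uses the Mongo driver's own types; the target mappings in the comments are assumptions, not a settled schema):

import { Decimal128, ObjectId } from "mongodb";

// Minimal sketch: normalize Mongo-specific values into SQL-friendly
// primitives before insertion.
function toSqlValue(value: unknown): unknown {
  if (value instanceof ObjectId) return value.toHexString(); // -> VARCHAR(24), or a UUID later
  if (value instanceof Decimal128) return value.toString();  // -> NUMERIC, via string to avoid float loss
  if (Array.isArray(value)) return value.map(toSqlValue);    // -> PostgreSQL array or JSONB
  return value; // Date, string, number, boolean pass through
}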

2025-03-22 edited

My preference is tilting toward PostgreSQL. The biggest factor is the Json type: value objects can live in a column of their own rather than in a separate table, and there is syntax for querying JSON properties --though that will probably require raw queries-- which should make the migration considerably easier. We can't just migrate blindly, though: we still have to work out how to convert MongoDB's ObjectID into a PK field, and rebuild the relationship graph between tables that currently have no enforced relations.
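
For reference, a raw JSONB query through Prisma could look like this sketch (the academies table and the address->>'city' property are hypothetical, borrowed from the schema discussion below):

import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// Sketch: filter on a JSONB property with PostgreSQL's ->> operator via
// $queryRaw. $queryRaw parameterizes ${city}, so no injection risk.
async function findAcademiesByCity(city: string) {
  return prisma.$queryRaw`
    SELECT id, name
    FROM academies
    WHERE address->>'city' = ${city}
  `;
}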

AI Response :: Claude 3.7 Sonnet

Based on your existing backend system built with NestJS, MongoDB, and Prisma ORM, I'll outline a comprehensive strategy for migrating to a SQL-based database. From your requirements and the notes in the paste.txt document, PostgreSQL appears to be the preferred option, so I'll focus primarily on that while highlighting considerations for other SQL databases.

1. PostgreSQL as the Primary Target

PostgreSQL offers several advantages that align with your needs:

  • JSONB for storing value objects and embedded documents without extra tables
  • Rich native types (UUID, enums, arrays, NUMERIC) matching your Prisma models
  • Competitive pricing on RDS and Aurora (see the tables above)
  • A mature open-source ecosystem with commercial support options

2. Migration Strategy Overview

Phase 1: Schema Design & Preparation

  1. Map Document Schema to Relational Schema

    • Convert MongoDB collections to SQL tables
    • Define explicit relations between entities
    • Design primary and foreign key relationships
  2. Handle ObjectId Migration

    • Option 1: Convert ObjectIds to UUIDs (preferred for PostgreSQL; see the conversion sketch after this list)
    -- PostgreSQL schema example
    CREATE TABLE users (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      username VARCHAR(255) NOT NULL,
      /* other fields */
    );
    
    • Option 2: Preserve original ObjectId strings in a character field
    -- For compatibility during transition
    CREATE TABLE users (
      id VARCHAR(24) PRIMARY KEY,
      mongo_id VARCHAR(24) UNIQUE, -- Store original during migration
      /* other fields */
    );
    
  3. JSON Field Planning

    • Map embedded MongoDB documents to PostgreSQL JSONB fields
    • Example for address object in user model:
    CREATE TABLE users (
      id UUID PRIMARY KEY,
      /* other fields */
      address JSONB -- Store complex address object
    );
    
  4. Schema Validation

    • Create a test schema and validate with sample data
    • Confirm all domain requirements are fulfilled by the relational design
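
Picking up Option 1 from step 2 above, ObjectId-to-UUID conversion can be made deterministic with UUIDv5, so the same ObjectId always maps to the same UUID across migration runs (the namespace constant below is an arbitrary fixed value chosen for illustration, not a library requirement):

import { v5 as uuidv5 } from "uuid";

// Any fixed UUID works as the namespace, as long as every migration
// run uses the same one; this value is an arbitrary example.
const MIGRATION_NAMESPACE = "a3c9f1d4-2b6e-4f0a-8c7d-5e9b1a2c3d4f";

// Deterministic: the same ObjectId hex string always maps to the same
// UUID, so re-runs and cross-collection references stay consistent.
function objectIdToUuid(objectIdHex: string): string {
  return uuidv5(objectIdHex, MIGRATION_NAMESPACE);
}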

Phase 2: Implementation & Migration

  1. Update Prisma Schema

    • Modify your Prisma schema to work with PostgreSQL
    • Add explicit relations and constraints
    • Example:
    model User {
      id        String   @id @default(uuid())
      name      String
      email     String   @unique
      tickets   Ticket[]
      createdAt DateTime @default(now())
      updatedAt DateTime @updatedAt
      address   Json?    // For complex MongoDB documents
    }
    
  2. Database Migration Tooling

    • Use a combination of custom scripts and tools like MongoDB Compass for export
    • Consider leveraging AWS Database Migration Service if using AWS infrastructure
    • For complex migrations, implement a custom ETL pipeline (see the first sketch after this list)
  3. Data Transformation Scripts

    • Create scripts to handle data transformation between document and relational models
    • Transform ObjectIds to UUIDs/strings
    • Resolve embedded documents to related tables
    • Handle arrays and complex MongoDB-specific structures
  4. Incremental Migration Approach

    • Migrate individual collections/modules separately
    • Start with less complex, less critical modules
    • Implement a dual-write strategy during transition (a sketch follows after this list)
  5. API & Service Layer Updates

    • Modify service classes to handle SQL-specific operations
    • Update repository patterns/DAOs to work with relational data
    • Keep service interfaces consistent for client compatibility
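
As a sketch of the custom ETL pipeline from steps 2-3 (database, collection, and model names are illustrative; objectIdToUuid is the helper sketched in Phase 1):

import { MongoClient } from "mongodb";
import { PrismaClient } from "@prisma/client";

const mongo = new MongoClient(process.env.MONGO_URL!);
const prisma = new PrismaClient();

// One collection's ETL step: read documents from MongoDB, normalize
// Mongo-specific values, and insert rows into PostgreSQL.
// Batching, retries, and progress tracking are omitted for brevity.
async function migrateUsers(): Promise<void> {
  await mongo.connect();
  const cursor = mongo.db("app").collection("users").find();

  for await (const doc of cursor) {
    await prisma.user.create({
      data: {
        id: objectIdToUuid(doc._id.toHexString()), // helper from the Phase 1 sketch
        name: doc.name,
        address: doc.address ?? undefined, // embedded document -> JSONB column
      },
    });
  }

  await mongo.close();
}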
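And a minimal dual-write sketch for step 4's transition period (mongoUserRepo is a hypothetical existing repository; MongoDB remains the source of truth here, with PostgreSQL mirror failures reconciled later):

// Transition-period write path: MongoDB first (source of truth),
// then mirror into PostgreSQL; a mirror failure is logged, not fatal.
async function createUserDualWrite(input: { name: string; email: string }) {
  const mongoUser = await mongoUserRepo.create(input); // existing MongoDB path (hypothetical repo)

  try {
    await prisma.user.create({
      data: {
        id: objectIdToUuid(mongoUser._id.toHexString()),
        name: input.name,
        email: input.email,
      },
    });
  } catch (err) {
    // Reconciled later by a backfill job comparing both stores.
    console.error("dual-write to PostgreSQL failed", err);
  }

  return mongoUser;
}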

Phase 3: Verification & Optimization

  1. Testing Strategy

    • Create comprehensive test suites for migrated functionality
    • Implement parallel testing against both databases
    • Performance testing to validate SQL performance gains
  2. Indexing Optimization

    • Identify high-volume queries from current system
    • Implement appropriate indexes for these queries
    • Example:
    CREATE INDEX idx_user_username ON users(username);
    CREATE INDEX idx_academy_owner ON academies(owner_id);
    CREATE INDEX idx_ticket_plan_type ON ticket_plans(type);
    
  3. Query Optimization

    • Analyze and optimize complex queries
    • Consider denormalization for performance-critical paths
    • Implement caching strategies where appropriate

3. Specific Migration Challenges & Solutions

Complex Document Structures

Challenge: MongoDB's flexible schema allows deeply nested documents and arrays.

Solution:

  • Map nested objects to JSONB columns
  • Map simple string arrays to native PostgreSQL arrays (e.g., TEXT[])
  • Promote embedded documents that need relational querying into their own tables

Example conversion:

// MongoDB document
{
  _id: ObjectId("..."),
  name: "Tennis Academy",
  address: {
    detail: "123 Main St",
    location: {
      coordinates: [127.1, 37.5],
      type: "Point"
    },
    oldAddress: "Old address",
    roadAddress: "Road address"
  },
  images: ["url1", "url2"]
}

PostgreSQL approach:

CREATE TABLE academies (
  id UUID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  address JSONB,  -- Store complex address object as JSONB
  images TEXT[]   -- Use PostgreSQL array for simple string arrays
);

ObjectId References

Challenge: MongoDB uses ObjectId references between collections.

Solution:

CREATE TABLE academies (
  id UUID PRIMARY KEY,
  -- other fields
);

CREATE TABLE courts (
  id UUID PRIMARY KEY,
  academy_id UUID NOT NULL REFERENCES academies(id),
  -- other fields
);

Handling Transactions

Challenge: Moving from MongoDB's document-level atomicity to SQL transactions.

Solution:

async createAcademyWithCourts(data: CreateAcademyDto): Promise<Academy> {
  return this.prisma.$transaction(async (tx) => {
    const academy = await tx.academy.create({
      data: {
        name: data.name,
        // other fields
      },
    });
    
    for (const court of data.courts) {
      await tx.court.create({
        data: {
          name: court.name,
          academyId: academy.id,
          // other fields
        },
      });
    }
    
    return academy;
  });
}

4. Technical Implementation Details

Schema Conversion Examples

Here are specific schema conversion examples based on your Prisma models:

MongoDB Academy Model:

model Academy {
  id                 String               @id @default(auto()) @map("_id") @db.ObjectId
  name               String
  owner              String               @db.ObjectId
  // other fields
  address            AcademymodelsAddress
  // relations
  Tag               AcademyTag[]
  TicketPlan        TicketPlan[]
}

PostgreSQL Conversion:

model Academy {
  id                 String               @id @default(uuid())
  name               String
  owner              String               @db.Uuid
  // other fields
  address            Json                 // Store as JSONB in PostgreSQL
  // relations with explicit relations
  tags               AcademyTag[]
  ticketPlans        TicketPlan[]
  
  @@index([name])
  @@index([owner])
}

NestJS Service Layer Updates

Your NestJS services will need modifications to handle SQL-specific operations:

// Before (MongoDB)
async findUserTickets(userId: string): Promise<Ticket[]> {
  return this.prisma.ticket.findMany({
    where: {
      user: userId, // ObjectId reference
    },
  });
}

// After (PostgreSQL)
async findUserTickets(userId: string): Promise<Ticket[]> {
  return this.prisma.ticket.findMany({
    where: {
      userId: userId, // Explicit foreign key
    },
    include: {
      ticketPlan: true, // Explicitly include relations
    },
  });
}

Indexing Strategy

Based on your schema and likely query patterns:

  1. Primary Indexes:

    • All primary keys and foreign keys
  2. Performance Indexes:

    • User lookups: username, phone, email
    • Academy lookups: name, owner, location (PostgreSQL GiST index for geospatial)
    • Ticket queries: user, status, expireTime
    • Payment queries: orderId, status, createdTime
  3. Composite Indexes:

    • (academy, court, from, to) for reservation availability checks
    • (user, ticketPlan, isExpired) for active ticket queries
-- Example indexes for PostgreSQL
CREATE INDEX idx_ticket_user_status ON tickets(user_id, status);
CREATE INDEX idx_reservation_time ON reservations(court_id, start_time, end_time);
CREATE INDEX idx_academy_location ON academies
  USING GIST (ST_SetSRID(ST_MakePoint((address->>'longitude')::float, (address->>'latitude')::float), 4326)); -- requires the PostGIS extension

5. PostgreSQL-Specific Advantages

  1. JSONB Operations:

    • Query inside JSON structures:
    SELECT * FROM academies WHERE (address->'location'->'coordinates'->>0)::float > 127.0;
    
    • Index JSON properties:
    CREATE INDEX idx_academy_city ON academies ((address->>'city'));
    
  2. Array Operations:

    • Query array elements:
    SELECT * FROM courts WHERE 'basketball' = ANY(sports);
    
  3. Custom Types and Functions:

    • Create domain-specific types:
    CREATE TYPE reservation_status AS ENUM ('pending', 'confirmed', 'cancelled');
    
    • Implement business logic in database functions:
    CREATE FUNCTION check_reservation_availability(court_id UUID, start_time TIMESTAMP, end_time TIMESTAMP) 
    RETURNS BOOLEAN AS $$
    -- Implementation
    $$ LANGUAGE plpgsql;
    

6. Migration Timeline and Milestones

  1. Preparation Phase (3-4 weeks)

    • Schema design and mapping
    • Test environment setup
    • Migration script development
  2. Testing Phase (2-3 weeks)

    • Data migration verification
    • Performance testing
    • API compatibility testing
  3. Incremental Migration (6-10 weeks)

    • Migrate modules in order: Users → Academies → Courts → Ticket Plans → Tickets → Reservations → Payments
    • Implement dual-write for critical paths
    • Incremental cutover by module
  4. Optimization Phase (2-4 weeks)

    • Performance tuning
    • Index optimization
    • Query refinement
  5. Monitoring and Verification (2 weeks)

    • Extended monitoring period
    • Performance baseline establishment
    • Final optimization adjustments

7. Cost Considerations

Based on your workload:

  1. PostgreSQL on RDS:

    • Starting with a db.t4g.small ($0.051/hour) for development
    • Scaling to db.t4g.medium/large for production
    • Multi-AZ for production environment ($0.102/hour)
  2. PostgreSQL Aurora:

    • Consider Serverless v2 for variable workloads
    • Set minimum and maximum ACU based on typical load patterns
    • Estimated cost: $0.20-0.26 per ACU hour
  3. Migration Costs:

    • Development time for migration scripts
    • Potential dual-running costs during transition
    • AWS DMS costs if using that service

Conclusion

PostgreSQL offers the best balance of features, performance, and cost-effectiveness for your migration from MongoDB. The JSONB type provides a bridge between document and relational models, making the migration path smoother while giving you all the benefits of a robust relational database.

The migration strategy outlined above prioritizes:

  • Incremental, module-by-module migration rather than a big-bang cutover
  • Data integrity, verified by testing at every phase
  • Minimal downtime, with dual-writes during the transition

By following this approach, you can successfully transition from MongoDB to PostgreSQL while improving your system's performance, maintainability, and cost-effectiveness.

AI Response :: o3-mini-high

Below is a comprehensive plan for migrating from MongoDB to a SQL-based system using Prisma ORM. This plan assumes you already have a well-structured Prisma schema for MongoDB and outlines the steps from schema conversion to data migration and verification.


1. Assess and Redesign the Data Model


2. Update the Prisma Schema


3. Set Up the SQL Database Environment


4. Plan and Execute Data Migration


5. Update Application Code and Deployment


6. Monitor and Optimize


Summary

This migration plan involves careful redesign of your data model, updating the Prisma schema, setting up a new SQL environment, and writing an ETL process for data migration. Testing at every stage is crucial to ensure data integrity and application stability. Once the migration is complete, updating your application configuration and monitoring performance will ensure a smooth transition from MongoDB to your new SQL-based system.

Feel free to reach out with any follow-up questions or for further details on specific parts of the migration process.



  1. BYOL - Bring Your Own Licence; applies when you already own the relevant license. ↩︎ ↩︎